[OpenMP] Enable simd in non-reduction composite constructs #146097

mrkajetanp · 2025-06-27T15:52:21Z

Despite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs.

Despite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs. Signed-off-by: Kajetan Puchalski <[email protected]>

llvmbot · 2025-06-27T15:52:54Z

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-openmp

@llvm/pr-subscribers-mlir-llvm

Author: Kajetan Puchalski (mrkajetanp)

Changes

Despite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs.

Full diff: https://github.com/llvm/llvm-project/pull/146097.diff

4 Files Affected:

(modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+2-5)
(added) mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir (+30)
(added) mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir (+33)
(modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (-14)

diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 23140f22555a5..de51ca49649f9 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -2847,11 +2847,8 @@ convertOmpSimd(Operation &opInst, llvm::IRBuilderBase &builder,
   llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder();
   auto simdOp = cast<omp::SimdOp>(opInst);
 
-  // TODO: Replace this with proper composite translation support.
-  // Currently, simd information on composite constructs is ignored, so e.g.
-  // 'do/for simd' will be treated the same as a standalone 'do/for'. This is
-  // allowed by the spec, since it's equivalent to using a SIMD length of 1.
-  if (simdOp.isComposite()) {
+  // TODO: Replace this once simd + reduction is properly supported
+  if (simdOp.isComposite() && simdOp.getReductionByref().has_value()) {
     if (failed(convertIgnoredWrapper(simdOp, moduleTranslation)))
       return failure();
 
diff --git a/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir b/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir
new file mode 100644
index 0000000000000..9ea0ff590144f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir
@@ -0,0 +1,30 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// Check that omp.simd as a leaf of a composite construct still generates
+// the appropriate loop vectorization attribute.
+
+// CHECK-LABEL: define internal void @test_parallel_do_simd..omp_par
+// CHECK: ![[VAL:.*]] = !{!"llvm.loop.vectorize.enable", i1 true}
+
+llvm.func @test_parallel_do_simd() {
+  %0 = llvm.mlir.constant(1 : i64) : i64
+  %1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+  %2 = llvm.mlir.constant(1000 : i32) : i32
+  %3 = llvm.mlir.constant(1 : i32) : i32
+  %4 = llvm.mlir.constant(1 : i64) : i64
+  omp.parallel {
+    %5 = llvm.mlir.constant(1 : i64) : i64
+    %6 = llvm.alloca %5 x i32 {bindc_name = "i", pinned} : (i64) -> !llvm.ptr
+    %7 = llvm.mlir.constant(1 : i64) : i64
+    omp.wsloop {
+      omp.simd {
+        omp.loop_nest (%arg0) : i32 = (%3) to (%2) inclusive step (%3) {
+          llvm.store %arg0, %6 : i32, !llvm.ptr
+          omp.yield
+        }
+      } {omp.composite}
+    } {omp.composite}
+    omp.terminator
+  }
+  llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir b/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir
new file mode 100644
index 0000000000000..108a99cf7c40f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir
@@ -0,0 +1,33 @@
+// RUN: mlir-translate --mlir-to-llvmir %s | FileCheck %s
+
+// Check that omp.simd as a leaf of a composite construct still generates
+// the appropriate loop vectorization attribute.
+
+// CHECK-LABEL: define internal void @test_teams_distribute_parallel_do_simd..omp_par
+// CHECK: ![[VAL:.*]] = !{!"llvm.loop.vectorize.enable", i1 true}
+
+omp.private {type = private} @_QFEi_private_i32 : i32
+llvm.func @test_teams_distribute_parallel_do_simd() {
+  %0 = llvm.mlir.constant(1 : i64) : i64
+  %1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+  %2 = llvm.mlir.constant(1000 : i32) : i32
+  %3 = llvm.mlir.constant(1 : i32) : i32
+  %4 = llvm.mlir.constant(1 : i64) : i64
+  omp.teams {
+    omp.parallel {
+      omp.distribute {
+        omp.wsloop {
+          omp.simd private(@_QFEi_private_i32 %1 -> %arg0 : !llvm.ptr) {
+            omp.loop_nest (%arg1) : i32 = (%3) to (%2) inclusive step (%3) {
+              llvm.store %arg1, %arg0 : i32, !llvm.ptr
+              omp.yield
+            }
+          } {omp.composite}
+        } {omp.composite}
+      } {omp.composite}
+      omp.terminator
+    } {omp.composite}
+    omp.terminator
+  }
+  llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir
index 97608ca3b4df1..c515e6e70b264 100644
--- a/mlir/test/Target/LLVMIR/openmp-todo.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir
@@ -26,20 +26,6 @@ llvm.func @atomic_hint(%v : !llvm.ptr, %x : !llvm.ptr, %expr : i32) {
 
 // -----
 
-llvm.func @do_simd(%lb : i32, %ub : i32, %step : i32) {
-  omp.wsloop {
-    // expected-warning@below {{simd information on composite construct discarded}}
-    omp.simd {
-      omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
-        omp.yield
-      }
-    } {omp.composite}
-  } {omp.composite}
-  llvm.return
-}
-
-// -----
-
 llvm.func @distribute_allocate(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) {
   // expected-error@below {{not yet implemented: Unhandled clause allocate in omp.distribute operation}}
   // expected-error@below {{LLVM Translation failed for operation: omp.distribute}}

llvmbot · 2025-06-27T15:52:55Z

@llvm/pr-subscribers-flang-openmp

Author: Kajetan Puchalski (mrkajetanp)

Changes

Despite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs.

Full diff: https://github.com/llvm/llvm-project/pull/146097.diff

4 Files Affected:

(modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+2-5)
(added) mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir (+30)
(added) mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir (+33)
(modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (-14)

diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 23140f22555a5..de51ca49649f9 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -2847,11 +2847,8 @@ convertOmpSimd(Operation &opInst, llvm::IRBuilderBase &builder,
   llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder();
   auto simdOp = cast<omp::SimdOp>(opInst);
 
-  // TODO: Replace this with proper composite translation support.
-  // Currently, simd information on composite constructs is ignored, so e.g.
-  // 'do/for simd' will be treated the same as a standalone 'do/for'. This is
-  // allowed by the spec, since it's equivalent to using a SIMD length of 1.
-  if (simdOp.isComposite()) {
+  // TODO: Replace this once simd + reduction is properly supported
+  if (simdOp.isComposite() && simdOp.getReductionByref().has_value()) {
     if (failed(convertIgnoredWrapper(simdOp, moduleTranslation)))
       return failure();
 
diff --git a/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir b/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir
new file mode 100644
index 0000000000000..9ea0ff590144f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir
@@ -0,0 +1,30 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// Check that omp.simd as a leaf of a composite construct still generates
+// the appropriate loop vectorization attribute.
+
+// CHECK-LABEL: define internal void @test_parallel_do_simd..omp_par
+// CHECK: ![[VAL:.*]] = !{!"llvm.loop.vectorize.enable", i1 true}
+
+llvm.func @test_parallel_do_simd() {
+  %0 = llvm.mlir.constant(1 : i64) : i64
+  %1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+  %2 = llvm.mlir.constant(1000 : i32) : i32
+  %3 = llvm.mlir.constant(1 : i32) : i32
+  %4 = llvm.mlir.constant(1 : i64) : i64
+  omp.parallel {
+    %5 = llvm.mlir.constant(1 : i64) : i64
+    %6 = llvm.alloca %5 x i32 {bindc_name = "i", pinned} : (i64) -> !llvm.ptr
+    %7 = llvm.mlir.constant(1 : i64) : i64
+    omp.wsloop {
+      omp.simd {
+        omp.loop_nest (%arg0) : i32 = (%3) to (%2) inclusive step (%3) {
+          llvm.store %arg0, %6 : i32, !llvm.ptr
+          omp.yield
+        }
+      } {omp.composite}
+    } {omp.composite}
+    omp.terminator
+  }
+  llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir b/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir
new file mode 100644
index 0000000000000..108a99cf7c40f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir
@@ -0,0 +1,33 @@
+// RUN: mlir-translate --mlir-to-llvmir %s | FileCheck %s
+
+// Check that omp.simd as a leaf of a composite construct still generates
+// the appropriate loop vectorization attribute.
+
+// CHECK-LABEL: define internal void @test_teams_distribute_parallel_do_simd..omp_par
+// CHECK: ![[VAL:.*]] = !{!"llvm.loop.vectorize.enable", i1 true}
+
+omp.private {type = private} @_QFEi_private_i32 : i32
+llvm.func @test_teams_distribute_parallel_do_simd() {
+  %0 = llvm.mlir.constant(1 : i64) : i64
+  %1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+  %2 = llvm.mlir.constant(1000 : i32) : i32
+  %3 = llvm.mlir.constant(1 : i32) : i32
+  %4 = llvm.mlir.constant(1 : i64) : i64
+  omp.teams {
+    omp.parallel {
+      omp.distribute {
+        omp.wsloop {
+          omp.simd private(@_QFEi_private_i32 %1 -> %arg0 : !llvm.ptr) {
+            omp.loop_nest (%arg1) : i32 = (%3) to (%2) inclusive step (%3) {
+              llvm.store %arg1, %arg0 : i32, !llvm.ptr
+              omp.yield
+            }
+          } {omp.composite}
+        } {omp.composite}
+      } {omp.composite}
+      omp.terminator
+    } {omp.composite}
+    omp.terminator
+  }
+  llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir
index 97608ca3b4df1..c515e6e70b264 100644
--- a/mlir/test/Target/LLVMIR/openmp-todo.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir
@@ -26,20 +26,6 @@ llvm.func @atomic_hint(%v : !llvm.ptr, %x : !llvm.ptr, %expr : i32) {
 
 // -----
 
-llvm.func @do_simd(%lb : i32, %ub : i32, %step : i32) {
-  omp.wsloop {
-    // expected-warning@below {{simd information on composite construct discarded}}
-    omp.simd {
-      omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
-        omp.yield
-      }
-    } {omp.composite}
-  } {omp.composite}
-  llvm.return
-}
-
-// -----
-
 llvm.func @distribute_allocate(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) {
   // expected-error@below {{not yet implemented: Unhandled clause allocate in omp.distribute operation}}
   // expected-error@below {{LLVM Translation failed for operation: omp.distribute}}

mrkajetanp · 2025-06-27T15:57:39Z

Question for reviewers who have worked on this before - the current behaviour is fairly inconsistent, that is to say, using "simd reduction" on its own results in a crash because it's not implemented, whereas using "simd reduction" as part of a composite construct simply ignores the simd. Is it desirable to keep this behaviour? For instance, the following program will crash:

integer :: sum
sum = 0
!$omp simd reduction(+:sum)
do i=1, 1000
    sum = sum + 1
end do

While this one will only emit a warning.

integer :: sum
sum = 0
!$omp do simd reduction(+:sum)
do i=1, 1000
    sum = sum + 1
end do

There are several tests that rely on this, which is why I kept the current warning for composite constructs with reduction.

Additionally, if someone knows of any scenarios which I've overlooked that this might break for, please point them out. I was expecting something else to require fixing in order to do this, but I've not found anything that breaks so far.

tblah

Please could you elaborate on the testing you've done to be sure this is correctly supported.

mrkajetanp · 2025-06-27T21:22:04Z

I ran both gfortran and Fujitsu test suites (alongside the LLVM one) to make sure this does not introduce regressions in those. For a number of test Fortran programs using composite constructs with simd, I manually inspected the IR and the final binary to make sure that with this change, vectorisable instructions inside these constructs turn into vectorised instructions even with -fno-vectorize, where without this change they do not.

tblah · 2025-06-30T09:10:39Z

I believe the reason for not applying the "not yet implemented" message to simd reduction for composite constructs is that if the reduction clause is used on the composite construct, that reduction is logically applied to most of the constructs comprising the composite construct. So disabling simd reduction would disable reduction for any composite construct containing simd.

SIMD is unusual because not implementing the reduction won't produce incorrect results (except some subtle effects around the re-presentable range of numbers) - it is just less likely to vectorize. I will post a patch implementing this soon so I don't think we need to worry.

mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp

mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir

skatrak · 2025-06-30T12:47:09Z

Question for reviewers who have worked on this before - the current behaviour is fairly inconsistent, that is to say, using "simd reduction" on its own results in a crash because it's not implemented, whereas using "simd reduction" as part of a composite construct simply ignores the simd. Is it desirable to keep this behaviour? For instance, the following program will crash:

I was actually planning on proposing changes to this soon, so I can just explain from my side what the thinking behind some of these differences are. Currently, we entirely ignore simd within a composite construct, which means that we don't check whether there are unimplemented simd clauses being specified. In the case of standalone simd, though, there is some existing support, so there are checks for unimplemented features. The result of that is that do simd reduction(+: x) compiles and runs as if it was just do reduction(+: x), whereas simd reduction(+: x) triggers a not-yet-implemented error for the reduction clause.

The reason for ignoring simd information on composite constructs was to prevent the introduction of simd from causing not-yet-implemented errors when it would be legal by the spec to always codegen for SIMD lanes of 1 element, especially when there are reductions involved, as @tblah described. So, composite constructs behave as if they always resulted in 1-element SIMD lanes. This is very arbitrary, so I like your proposal to handle simd in composite constructs. At the moment I'm most concerned about what to do about unsupported clauses.

How much of an issue would it be to always emit warnings instead of errors for all unsupported simd clauses? Does this become an issue only after we generate 2+ element SIMD lanes? I'm thinking that from the target side we don't currently want to be doing anything with simd constructs, so we might still want to have the option to ignore them.

mrkajetanp requested review from skatrak, jsjodin and tblah June 27, 2025 15:52

llvmbot added mlir:llvm mlir mlir:openmp flang:openmp labels Jun 27, 2025

tblah reviewed Jun 27, 2025

View reviewed changes

tblah reviewed Jun 30, 2025

View reviewed changes

mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp Show resolved Hide resolved

mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir Outdated Show resolved Hide resolved

Expand tests

4f3d361

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[OpenMP] Enable simd in non-reduction composite constructs #146097

[OpenMP] Enable simd in non-reduction composite constructs #146097

mrkajetanp commented Jun 27, 2025

Uh oh!

llvmbot commented Jun 27, 2025 •

edited

Loading

Uh oh!

llvmbot commented Jun 27, 2025

Uh oh!

mrkajetanp commented Jun 27, 2025

Uh oh!

tblah left a comment

Uh oh!

mrkajetanp commented Jun 27, 2025

Uh oh!

tblah commented Jun 30, 2025

Uh oh!

Uh oh!

Uh oh!

skatrak commented Jun 30, 2025

Uh oh!

Uh oh!

[OpenMP] Enable simd in non-reduction composite constructs #146097

Are you sure you want to change the base?

[OpenMP] Enable simd in non-reduction composite constructs #146097

Conversation

mrkajetanp commented Jun 27, 2025

Uh oh!

llvmbot commented Jun 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jun 27, 2025

Uh oh!

mrkajetanp commented Jun 27, 2025

Uh oh!

tblah left a comment

Choose a reason for hiding this comment

Uh oh!

mrkajetanp commented Jun 27, 2025

Uh oh!

tblah commented Jun 30, 2025

Uh oh!

Uh oh!

Uh oh!

skatrak commented Jun 30, 2025

Uh oh!

Uh oh!

llvmbot commented Jun 27, 2025 •

edited

Loading